MixedGrad: An O(1/T) Convergence Rate Algorithm for Stochastic Smooth Optimization

Authors

  • Mehrdad Mahdavi
  • Rong Jin
Abstract

It is well known that the optimal convergence rate for stochastic optimization of smooth functions is O(1/√T), which is the same as that for stochastic optimization of Lipschitz continuous convex functions. This is in contrast to optimizing smooth functions using full gradients, which yields a convergence rate of O(1/T). In this work, we consider a new setup for optimizing smooth functions, termed Mixed Optimization, which allows access to both a stochastic oracle and a full gradient oracle. Our goal is to significantly improve the convergence rate of stochastic optimization of smooth functions by allowing a small number of additional accesses to the full gradient oracle. We show that, with O(ln T) calls to the full gradient oracle and O(T) calls to the stochastic oracle, the proposed mixed optimization algorithm achieves an optimization error of O(1/T).
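To make the oracle budget concrete, here is a minimal Python sketch of one way a few full gradient calls can be combined with many stochastic calls. It is not the authors' MixedGrad procedure, which the abstract does not spell out. It swaps in an SVRG-style variance-reduced update, and the one-dimensional least-squares objective, the doubling epoch lengths, the step size, and the name `mixed_gradient_sketch` are all assumptions chosen for illustration. The toy objective is also strongly convex, which is a stronger condition than the smoothness assumed in the abstract.

```python
import random


def mixed_gradient_sketch(a, b, num_epochs=8, first_epoch_len=20, step=0.1, seed=0):
    """Illustrative only: minimize (1/n) * sum_i 0.5 * (a[i] * w - b[i])**2.

    Each epoch makes ONE full gradient call at an anchor point, then runs
    cheap stochastic steps corrected by that full gradient. Epoch lengths
    double (an assumption), so the number of full gradient calls grows
    logarithmically in the number of stochastic calls.
    """
    rng = random.Random(seed)
    n = len(a)
    w = 0.0
    full_calls = 0
    stochastic_calls = 0
    epoch_len = first_epoch_len
    for _ in range(num_epochs):
        anchor = w
        # Full gradient oracle: exact gradient of the objective at the anchor.
        full_grad = sum(ai * (ai * anchor - bi) for ai, bi in zip(a, b)) / n
        full_calls += 1
        for _ in range(epoch_len):
            # Stochastic oracle: gradient of one randomly chosen component.
            i = rng.randrange(n)
            g_w = a[i] * (a[i] * w - b[i])
            g_anchor = a[i] * (a[i] * anchor - b[i])
            # Variance-reduced direction: unbiased for the full gradient at w.
            w -= step * (g_w - g_anchor + full_grad)
            stochastic_calls += 1
        epoch_len *= 2
    return w, full_calls, stochastic_calls


# Hypothetical toy data, generated deterministically.
a = [1.0 + 0.1 * (i % 3) for i in range(30)]
b = [2.0 * ai + 0.05 * ((i % 7) - 3) for i, ai in enumerate(a)]
w_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)

w, full_calls, stochastic_calls = mixed_gradient_sketch(a, b)
print("error:", abs(w - w_star))
print("full gradient calls:", full_calls)
print("stochastic calls:", stochastic_calls)
```

With the default settings the sketch makes 8 full gradient calls against 5100 stochastic calls, which mirrors the O(ln T) versus O(T) split described above, though only for this toy problem.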


Similar resources

Mixed Optimization for Smooth Functions

It is well known that the optimal convergence rate for stochastic optimization of smooth functions is O(1/√T), which is the same as that for stochastic optimization of Lipschitz continuous convex functions. This is in contrast to optimizing smooth functions using full gradients, which yields a convergence rate of O(1/T). In this work, we consider a new setup for optimizing smooth functions, termed as Mix...

Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization

Stochastic gradient descent (SGD) is a simple and popular method for solving stochastic optimization problems that arise in machine learning. For strongly convex problems, its convergence rate was known to be O(log(T)/T), obtained by running SGD for T iterations and returning the average point. However, recent results showed that using a different algorithm, one can get an optimal O(1/T) rate. This mig...

Zeroth-order Asynchronous Doubly Stochastic Algorithm with Variance Reduction

Zeroth-order (derivative-free) optimization attracts a lot of attention in machine learning, because explicit gradient calculations may be computationally expensive or infeasible. To handle large scale problems both in volume and dimension, recently asynchronous doubly stochastic zeroth-order algorithms were proposed. The convergence rate of existing asynchronous doubly stochastic zeroth order ...

O(logT) Projections for Stochastic Optimization of Smooth and Strongly Convex Functions

Traditional algorithms for stochastic optimization require projecting the solution at each iteration into a given domain to ensure its feasibility. When facing complex domains, such as the positive semidefinite cone, the projection operation can be expensive, leading to a high computational cost per iteration. In this paper, we present a novel algorithm that aims to reduce the number of project...

Fast Stochastic Variance Reduced ADMM for Stochastic Composition Optimization

We consider the stochastic composition optimization problem proposed in [17], which has applications ranging from estimation to statistical and machine learning. We propose the first ADMM-based algorithm named com-SVR-ADMM, and show that com-SVR-ADMM converges linearly for strongly convex and Lipschitz smooth objectives, and has a convergence rate of O(log S/S), which improves upon the O(S^(−4/9)) r...


Journal:
  • CoRR

Volume: abs/1307.7192

Publication date: 2013